allow for fingerprint() to hash full files if a .fullhash file is present in the directory #195

0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q

We have a couple of sources consisting of stylised parameters across a bunch of scenarios

$ ls -l /p/projects/rd3mod/inputdata/sources/industry_subsectors_specific/
total 66
-rw-rw-r-- 2 pehl rdev  494 Apr 21  2022 industry_specific_FE_limits.csv
-rw-rw-r-- 1 pehl rdev 4054 Aug 19  2022 specific_FE.csv
-rw-rw-r-- 1 pehl rdev 1017 Jun 17  2022 specific_material_alpha.csv
-rw-rw-r-- 1 pehl rdev  375 Mar 20  2023 specific_material_relative_change.csv
-rw-rw-r-- 1 pehl rdev 6021 Aug 19  2022 specific_material_relative.csv

It is entirely possible that these parameters are modified:

  1. without changing the file size, if n-digit numbers are simply replaced by other n-digit numbers, or if the changes happen to add and remove the same number of characters, and
  2. without changing anything in the first 300 bytes of the files, which in the case of industry_specific_FE_limits.csv are consumed entirely by the comment header and part of the csv header:
    $ head -c 300 industry_specific_FE_limits.csv 
    # Thermodynamic limits on industry specific FE consumption by Silvia Madeddu
    # (see post https://mattermost.pik-potsdam.de/rd3/pl/u7eg6ed5gpr85rabznepnaoqrr
    # and https://mattermost.pik-potsdam.de/rd3/pl/g74og14a7igi8n6trjbhgcntrc).
    # GJ/t for absolute subsectors, share for relative subsectors
    subse
    
    and in the case of specific_material_relative.csv cover just five of the 136 data lines.
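
A minimal shell sketch of the two conditions above, for illustration only. The file names and contents are made up, and `sha256sum` merely stands in for whatever hash `fingerprint()` actually uses. Both versions have the same size and the same first 300 bytes, so only a hash over the whole file tells them apart.

```shell
#!/bin/sh
# Illustration with hypothetical files, not the real sources.
# Assumes GNU coreutils (head -c, sha256sum) are available.
set -eu
tmp=$(mktemp -d)

# 299 '#' characters plus a newline: a 300-byte "comment header",
# followed by one data line that differs only in its 4-digit value.
{ head -c 299 /dev/zero | tr '\0' '#'; printf '\nvalue,1234\n'; } > "$tmp/old.csv"
{ head -c 299 /dev/zero | tr '\0' '#'; printf '\nvalue,5678\n'; } > "$tmp/new.csv"

# File sizes (arithmetic expansion strips any padding wc may add).
size_old=$(( $(wc -c < "$tmp/old.csv") ))
size_new=$(( $(wc -c < "$tmp/new.csv") ))

# Hash of the first 300 bytes only.
head_old=$(head -c 300 "$tmp/old.csv" | sha256sum | cut -d' ' -f1)
head_new=$(head -c 300 "$tmp/new.csv" | sha256sum | cut -d' ' -f1)

# Hash of the entire file.
full_old=$(sha256sum < "$tmp/old.csv" | cut -d' ' -f1)
full_new=$(sha256sum < "$tmp/new.csv" | cut -d' ' -f1)

echo "sizes:       $size_old vs $size_new"
echo "head hashes: $head_old vs $head_new"
echo "full hashes: $full_old vs $full_new"

rm -rf "$tmp"
```

The sizes and the 300-byte hashes match, while the full-file hashes differ.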

Analogous to this, if a file .fullhash is present in the directory, the files will be hashed in full, not just their first 300 bytes.
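
The marker-file idea can be sketched in shell as follows. This is not the code from this PR: `fingerprint_dir` is a hypothetical helper, `sha256sum` is an assumed stand-in for the real hash, and the sketch only mirrors the rule described above (hash the first 300 bytes unless `.fullhash` exists in the directory).

```shell
#!/bin/sh
# Sketch under stated assumptions; assumes GNU coreutils (head -c, sha256sum).
set -eu

# Hypothetical helper: print "name size hash" for each regular file in $1.
# Files are hashed in full if $1/.fullhash exists, else only their first 300 bytes.
fingerprint_dir() {
  dir=$1
  for f in "$dir"/*; do          # the glob skips dotfiles such as .fullhash
    [ -f "$f" ] || continue
    size=$(( $(wc -c < "$f") ))
    if [ -e "$dir/.fullhash" ]; then
      hash=$(sha256sum < "$f" | cut -d' ' -f1)
    else
      hash=$(head -c 300 "$f" | sha256sum | cut -d' ' -f1)
    fi
    printf '%s %s %s\n' "$(basename "$f")" "$size" "$hash"
  done
}

# Demo: a made-up file with a 300-byte header and one data line.
tmp=$(mktemp -d)
{ head -c 299 /dev/zero | tr '\0' '#'; printf '\nvalue,1234\n'; } > "$tmp/data.csv"
before_default=$(fingerprint_dir "$tmp")
touch "$tmp/.fullhash"
before_full=$(fingerprint_dir "$tmp")
rm "$tmp/.fullhash"

# Same-size edit beyond the first 300 bytes.
{ head -c 299 /dev/zero | tr '\0' '#'; printf '\nvalue,5678\n'; } > "$tmp/data.csv"
after_default=$(fingerprint_dir "$tmp")
touch "$tmp/.fullhash"
after_full=$(fingerprint_dir "$tmp")

rm -rf "$tmp"
```

Without the marker file the edit goes unnoticed (`before_default` equals `after_default`); with `.fullhash` present the fingerprints differ.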


@tscheypidi tscheypidi left a comment

sorry for the delay. Looks good to me.

@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q 0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q merged commit e74aca1 into pik-piam:master Dec 11, 2023
4 checks passed
@0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q 0UmfHxcvx5J7JoaOhFSs5mncnisTJJ6q deleted the dev/allow_for_full_hashing branch December 11, 2023 14:36